
    On focussed and less focussed model selection.

    Model selection usually provides models without specific concern for the purpose the selected model will be used for afterwards. The focussed information criterion, FIC, is developed to select a best model for inference on a given estimand. For example, in regression models the FIC can be used to select a model for the mean response of each individual subject in the study, which can be used to identify interesting subgroups in the data. Sometimes the FIC is considered too focussed: we would rather select a model that performs well for a whole subgroup, or even for all of the subjects in the study. We explain how to make the focussed information criterion a little less focussed via weighting methods.
    Keywords: Data; Focussed information criterion; Information; Methods; Model selection; Models
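    As a schematic illustration of the weighting idea (toy numbers only, not the FIC computation itself): given per-subject FIC scores for each candidate model, a "less focussed" criterion averages the scores over a subgroup and selects the model with the smallest weighted average. The model names and scores below are hypothetical.

```python
# Toy sketch of a weighted ("less focussed") FIC: average hypothetical
# per-subject FIC scores and pick the model with the smallest average.

def weighted_fic(fic_scores, weights):
    """Weighted average of per-subject FIC scores for one model."""
    total = sum(weights)
    return sum(w * s for w, s in zip(weights, fic_scores)) / total

# Hypothetical per-subject FIC scores for 4 subjects under two models.
fic_model_narrow = [0.8, 1.1, 2.5, 2.9]
fic_model_wide = [1.4, 1.3, 1.2, 1.5]

# Equal weights over the whole subgroup.
w = [1.0, 1.0, 1.0, 1.0]
scores = {"narrow": weighted_fic(fic_model_narrow, w),
          "wide": weighted_fic(fic_model_wide, w)}
best = min(scores, key=scores.get)
print(best)  # prints: wide
```

    Unequal weights recover intermediate cases between a single-subject focus and the whole-sample average.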

    Missing covariates in logistic regression, estimation and distribution selection.

    We derive explicit formulae for estimation in logistic regression models where some of the covariates are missing. Our approach allows the distribution of the missing covariates to be modeled as either a multivariate normal or a multivariate t-distribution. A main advantage of this method is that it is fast and does not require iterative procedures. A model selection method is derived that allows one to choose amongst these distributions. In addition, we consider versions of AIC, based on the EM algorithm and on multiple imputation methods, that are widely applicable to model selection in likelihood models in general.
    Keywords: Akaike information criterion; Likelihood model; Logistic regression; Missing covariates; Model selection; Multiple imputation; t-distribution
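    A minimal sketch of the multiple-imputation flavour of the idea, under an assumption of my own: the pooled criterion is taken here as the plain average of per-imputation AICs, which need not be the paper's exact criterion. The data, the gradient-ascent fitter, and the normal imputation draws are all illustrative stand-ins.

```python
import math
import random

def fit_logistic(xs, ys, steps=500, lr=0.5):
    """Fit an intercept-and-slope logistic regression by gradient ascent."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        ga = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            ga += y - p
            gb += (y - p) * x
        a += lr * ga / n
        b += lr * gb / n
    return a, b

def aic(xs, ys):
    a, b = fit_logistic(xs, ys)
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(a + b * x)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard the logs
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -2.0 * ll + 2 * 2  # two estimated parameters

random.seed(1)
ys = [0, 0, 0, 1, 0, 1, 1, 1]
observed_x = [0.1, 0.4, None, 0.9, 0.3, None, 1.2, 1.0]

# M completed data sets: fill each missing covariate with a random draw
# (a stand-in for draws from a fitted normal or t model), then pool by
# averaging the per-imputation AICs (my assumed pooling rule).
M = 5
aics = []
for _ in range(M):
    xs = [x if x is not None else random.gauss(0.6, 0.3) for x in observed_x]
    aics.append(aic(xs, ys))
mi_aic = sum(aics) / M
```

    Repeating this for each candidate covariate-distribution model and comparing the pooled criteria gives a model selection rule in the spirit of the abstract.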

    S-estimation and a robust conditional Akaike information criterion for linear mixed models.

    We study estimation and model selection for both the fixed and the random effects in the setting of linear mixed models, using outlier-robust S-estimators. Robustness at the level of the random effects as well as of the error terms is taken into account. The derived marginal and conditional information criteria are in the style of Akaike's information criterion, but avoid the use of a fully specified likelihood through a suitable S-estimation approach that minimizes a scale function. We derive the appropriate penalty terms and provide an implementation in R. The setting of semiparametric additive models fit with penalized regression splines, in a mixed-model formulation, is a specific application. Simulated data examples illustrate the effectiveness of the proposed criteria.
    Keywords: Akaike information criterion; Conditional likelihood; Effective degrees of freedom; Mixed model; Penalized regression spline; S-estimation
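    The scale function at the core of S-estimation can be illustrated in isolation. The sketch below computes an M-scale with Tukey's bisquare rho by a standard fixed-point iteration; the tuning constant, the toy residuals, and the iteration scheme are illustrative choices, not the paper's implementation.

```python
import math

def rho_bisquare(u, c=1.548):
    """Tukey bisquare rho, rescaled so that its maximum is 1."""
    if abs(u) >= c:
        return 1.0
    t = (u / c) ** 2
    return 1.0 - (1.0 - t) ** 3

def m_scale(residuals, b=0.5, c=1.548, iters=100):
    """Solve (1/n) * sum rho(r_i / s) = b for s by fixed-point iteration,
    starting from the median absolute residual."""
    s = sorted(abs(r) for r in residuals)[len(residuals) // 2] or 1.0
    n = len(residuals)
    for _ in range(iters):
        avg = sum(rho_bisquare(r / s, c) for r in residuals) / n
        s = s * math.sqrt(avg / b)
    return s

clean = [-1.2, -0.5, -0.1, 0.2, 0.4, 0.9]
contaminated = clean + [50.0]  # one gross outlier
s_clean = m_scale(clean)
s_cont = m_scale(contaminated)
# Because rho is bounded, the single outlier moves the scale only a
# little, whereas the sample standard deviation would explode.
```

    In the S-estimation setting, the fixed and random effects are chosen to make such a scale of the residuals as small as possible, which is what replaces the fully specified likelihood.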

    Order selection tests with multiply-imputed data.

    We develop nonparametric tests, applicable to data sets with missing observations, for the null hypothesis that a function has a prescribed form. Omnibus nonparametric tests do not need to specify a particular alternative parametric form and have power against a large range of alternatives; the order selection tests that we study are one example. We extend such order selection tests to the context of missing data. In particular, we consider likelihood-based order selection tests for multiply imputed data. A simulation study and a data analysis illustrate the performance of the tests. The same approach yields a model selection method in the style of Akaike's information criterion for multiply imputed data sets.
    Keywords: Akaike information criterion; Hypothesis test; Multiple imputation; Lack-of-fit test; Missing data; Omnibus test; Order selection
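    For orientation, the classical (complete-data) order selection statistic takes standardized sample Fourier coefficients of the residual process and maximizes their running mean of squares over the order. A minimal sketch, with made-up coefficient values; the 5% critical value of about 4.18 is quoted from the order-selection-test literature as a hedged reference point.

```python
def order_selection_stat(z):
    """z: standardized sample Fourier coefficients (approx. N(0,1) under
    the null). Statistic: max over k of (1/k) * sum_{j<=k} z_j**2."""
    best = 0.0
    running = 0.0
    for k, zj in enumerate(z, start=1):
        running += zj * zj
        best = max(best, running / k)
    return best

# Made-up coefficients: the first set looks like noise, the second has
# energy in the low orders, as a lack of fit would produce.
z_null_like = [0.3, -1.1, 0.8, 0.2, -0.5]
z_alt_like = [3.5, 2.8, 0.4, -0.2, 0.1]
t0 = order_selection_stat(z_null_like)
t1 = order_selection_stat(z_alt_like)
# Compare each statistic with the (approximate) 5% critical value 4.18.
```

    The multiple-imputation extension of the abstract replaces these complete-data coefficients by quantities computed from the multiply imputed data sets.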

    An asymptotic theory for model selection inference in general semiparametric problems.

    Recently, Hjort and Claeskens (2003) developed an asymptotic theory for model selection, model averaging and post-model-selection/averaging inference using likelihood methods in parametric models, along with associated confidence statements. In this paper, we consider a semiparametric version of this problem, wherein the likelihood depends on parameters and an unknown function, and model selection/averaging is applied to the parametric parts of the model. We show that all the results of Hjort and Claeskens hold in the semiparametric context, if the Fisher information matrix for parametric models is replaced by the semiparametric information bound, and if maximum likelihood estimators for parametric models are replaced by semiparametric efficient profile estimators. The results also describe the behavior of semiparametric model estimates when the parametric component is misspecified, and have implications as well for pointwise consistent model selectors.
    Keywords: Akaike information criterion; Bayes information criterion; Behavior; Efficient semiparametric estimation; Estimator; Frequentist model averaging; Implications; Information; Matrix; Maximum likelihood; Methods; Model; Model averaging; Model selection; Models; Problems; Profile likelihood; Research; Selection; Semiparametric model; Theory
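    A common concrete instance of frequentist model averaging is "smooth AIC" weighting, with weights proportional to exp(-AIC/2). The sketch below shows that recipe with hypothetical AIC values and focus-parameter estimates; it does not reproduce the Hjort-Claeskens limit theory itself.

```python
import math

def aic_weights(aics):
    """Smooth-AIC model-averaging weights, w_m proportional to
    exp(-AIC_m / 2), shifted by the best AIC for numerical stability."""
    best = min(aics)
    raw = [math.exp(-(a - best) / 2.0) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical AICs and focus-parameter estimates for three candidates.
aics = [102.3, 100.1, 105.7]
estimates = [1.9, 2.4, 1.2]
w = aic_weights(aics)
averaged = sum(wi * ei for wi, ei in zip(w, estimates))
# The second model (smallest AIC) dominates the weighted combination.
```

    The averaged estimate lies between the per-model estimates, pulled toward the best-supported model.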

    Goodness-of-fit tests in mixed models.

    Mixed models, with both random and fixed effects, are most often estimated under the assumption that the random effects are normally distributed. In this paper we propose several formal tests of the hypothesis that the random effects and/or errors are normally distributed. Most of the proposed methods can be extended to generalized linear models where tests for non-normal distributions are of interest. Our tests are nonparametric in the sense that they are designed to detect virtually any alternative to normality. In case of rejection of the null hypothesis, the nonparametric estimation method used to construct a test provides an estimator of the alternative distribution.
    Keywords: Mixed model; Hypothesis test; Nonparametric test; Minimum distance; Order selection
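    To fix ideas, one minimum-distance flavour of such a normality check compares the empirical distribution of (standardized) predicted random effects with the standard normal distribution. The sketch below uses the Kolmogorov distance and made-up effect values; the paper's tests are more refined than this.

```python
import math

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ks_distance(sample):
    """Kolmogorov distance between the empirical CDF and N(0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = normal_cdf(x)
        d = max(d, abs(i / n - f), abs(f - (i - 1) / n))
    return d

# Made-up standardized predicted random effects: one roughly symmetric
# set, one clearly skewed set.
normal_like = [-1.3, -0.6, -0.2, 0.1, 0.4, 0.9, 1.5]
skewed = [0.1, 0.2, 0.3, 0.5, 0.9, 1.8, 3.5]
d0 = ks_distance(normal_like)
d1 = ks_distance(skewed)
# The skewed sample sits much further from normality.
```

    Large distances signal departures from the normality assumption on the random effects.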

    A multiresolution approach to time warping achieved by a Bayesian prior-posterior transfer fitting strategy.

    The procedure known as warping aims at reducing phase variability in a sample of functional curve observations by applying a smooth bijection to the argument of each of the functions. We propose a natural representation of warping functions in terms of a new type of elementary function, named 'warping component functions', which are combined into the warping function by composition. A sequential Bayesian estimation strategy is introduced, which fits a series of models and transfers the posterior of the previous fit into the prior of the next fit. Model selection is based on a warping analogue of wavelet thresholding, combined with Bayesian inference.
    Keywords: Bayesian inference; Functional data analysis; Markov chain Monte Carlo sampling; Time warping; Warping components; Warping function
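    The composition idea can be shown with toy components: smooth increasing bijections of [0, 1] that fix the endpoints, combined by function composition. The sinusoidal form below is an illustrative choice of mine, not the paper's warping component functions.

```python
import math

def component(a):
    """Return t -> t + a*sin(2*pi*t)/(2*pi): a smooth increasing bijection
    of [0, 1] fixing the endpoints whenever |a| < 1."""
    def w(t):
        return t + a * math.sin(2 * math.pi * t) / (2 * math.pi)
    return w

def compose(*fs):
    """Compose functions: compose(f, g)(t) == f(g(t))."""
    def h(t):
        for f in reversed(fs):
            t = f(t)
        return t
    return h

# Combine two components into one warping function by composition.
warp = compose(component(0.6), component(-0.3))
ts = [i / 100 for i in range(101)]
vals = [warp(t) for t in ts]
# The composition again fixes the endpoints and stays strictly increasing.
```

    Because each component is a smooth increasing bijection, so is any composition, which is exactly what a warping function must be.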

    Nonparametric estimation of mean and dispersion functions in extended generalized linear models.

    In this paper the interest is in regression analysis for data that possibly show overdispersion or underdispersion. The starting point for modeling is the generalized linear model, in which we no longer assume a linear form for the mean regression function but allow it to be any smooth function of the covariate(s). With a view to analyzing overdispersed or underdispersed data, we additionally bring in an unknown dispersion function. The mean regression function and the dispersion function are then estimated using P-splines with a difference-type penalty to prevent overfitting. We discuss two approaches: one based on an extended quasi-likelihood idea and one based on a pseudo-likelihood approach. The choice of smoothing parameters and implementation issues are discussed. The performance of the estimation method is investigated via simulations, and its use is illustrated on several data sets, including continuous data, counts and proportions.
    Keywords: Double exponential family; Extended quasi-likelihood; Modeling; Overdispersion; Pseudo-likelihood; P-splines; Regression; Variance estimation; Underdispersion
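    For a Gaussian-response special case, the P-spline fit with a difference-type penalty minimizes a criterion of the following generic form (a hedged sketch; the paper's extended quasi-likelihood and pseudo-likelihood criteria replace the least-squares term):

```latex
\min_{\beta} \; \sum_{i=1}^{n} \Bigl\{ y_i - \sum_{j} \beta_j B_j(x_i) \Bigr\}^2
  \; + \; \lambda \sum_{j} \bigl( \Delta^2 \beta_j \bigr)^2,
\qquad
\Delta^2 \beta_j = \beta_j - 2\beta_{j-1} + \beta_{j-2},
```

    where the \(B_j\) form a rich B-spline basis, \(\Delta^2\) is the second-order difference operator on adjacent coefficients, and the smoothing parameter \(\lambda \ge 0\) controls the penalty that prevents overfitting.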

    Bootstrapping for penalized spline regression.

    We describe and contrast several different bootstrapping procedures for penalized spline smoothers. The bootstrapping procedures considered are variations on existing methods, developed under two different probabilistic frameworks. Under the first framework, penalized spline regression is considered an estimation technique for an unknown smooth function. The smooth function is represented in a high-dimensional spline basis, with spline coefficients estimated in penalized form. Under the second framework, the unknown function is treated as a realization of a set of random spline coefficients, which are then predicted in a linear mixed model. We describe how bootstrapping methods can be implemented under both frameworks, and we show in theory and through simulations and examples that bootstrapping provides valid inference in both cases. We compare the inference obtained under both frameworks and conclude that the latter generally produces better results than the former. The bootstrapping ideas are extended to hypothesis testing, where parametric components in a model are tested against nonparametric alternatives.
    Keywords: Methods; Framework; Regression; Linear mixed model; Mixed model; Model; Theory; Simulation; Hypothesis testing
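    The residual-bootstrap logic used under the first ("fixed smooth function") framework can be sketched with a plain least-squares line standing in for the penalized spline smoother; the resampling steps are the same, only the fitter differs. Everything below is an illustrative toy, not the paper's procedure.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope (stand-in smoother)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b

random.seed(7)
xs = [i / 10 for i in range(20)]
ys = [1.0 + 2.0 * x + random.gauss(0, 0.2) for x in xs]

# Step 1: fit once and keep fitted values and residuals.
a_hat, b_hat = fit_line(xs, ys)
fitted = [a_hat + b_hat * x for x in xs]
resid = [y - f for y, f in zip(ys, fitted)]

# Step 2: resample residuals onto the fitted curve and refit, many times.
boots = []
for _ in range(500):
    ystar = [f + random.choice(resid) for f in fitted]
    boots.append(fit_line(xs, ystar)[1])

# Step 3: read off a percentile interval for the slope.
boots.sort()
lo, hi = boots[12], boots[487]  # roughly a 95% percentile interval
```

    Under the second framework the refit step would instead re-predict the random spline coefficients in the mixed-model formulation.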